- 
            This paper introduces the concept of leakage-robust Bayesian persuasion. Situated between public Bayesian persuasion and private Bayesian persuasion, leakage-robust persuasion considers a setting where one or more signals privately communicated by a sender to the receivers may be leaked. We study the design of leakage-robust Bayesian persuasion schemes and quantify the price of robustness using two formalisms:
            - The first notion, k-worst-case persuasiveness, requires a signaling scheme to remain persuasive as long as each receiver observes no more than k leaked signals from other receivers. We quantify the Price of Robust Persuasiveness (PoRP_k), i.e., the gap in sender's utility as compared to the optimal private persuasion scheme, as Θ(min{2^k, n}) for supermodular sender utilities and Θ(k) for submodular or XOS sender utilities, where n is the number of receivers. This result also establishes that in some instances, Θ(log k) leakages are sufficient for the utility of the optimal leakage-robust persuasion to degenerate to that of public persuasion.
            - The second notion, expected downstream utility robustness, relaxes the persuasiveness requirement and instead considers the impact on sender's utility resulting from receivers best responding to their observations. By quantifying the Price of Robust Downstream Utility (PoRU) as the gap between the sender's expected utility over the randomness in the leakage pattern as compared to private persuasion, our results show that, over several natural and structured distributions of leakage patterns, PoRU improves PoRP to Θ(k) or even Θ(1), where k is the maximum number of leaked signals observable to each receiver across leakage patterns in the distribution.
            En route to these results, we show that subsampling and masking serve as general-purpose algorithmic paradigms for transforming any private persuasion signaling scheme to one that is leakage-robust, with minmax optimal loss in sender's utility.
A full version of this paper can be found at https://arxiv.org/abs/2411.16624.
Free, publicly-accessible full text available July 7, 2026
- 
            Collaboration is crucial for reaching collective goals. However, its effectiveness is often undermined by the strategic behavior of individual agents -- a fact that is captured by a high Price of Stability (PoS) in recent literature [Blum et al., 2021]. Implicit in the traditional PoS analysis is the assumption that agents have full knowledge of how their tasks relate to one another. We offer a new perspective on bringing about efficient collaboration among strategic agents using information design. Inspired by the growing importance of collaboration in machine learning (such as platforms for collaborative federated learning and data cooperatives), we propose a framework where the platform has more information about how the agents' tasks relate to each other than the agents themselves. We characterize how and to what degree such platforms can leverage their information advantage to steer strategic agents toward efficient collaboration. Concretely, we consider collaboration networks where each node is a task type held by one agent, and each task benefits from contributions made in its inclusive neighborhood of tasks. This network structure is known to the agents and the platform, but only the platform knows each agent's real location -- from the agents' perspective, their location is determined by a random permutation. We employ private Bayesian persuasion and design two families of persuasive signaling schemes that the platform can use to ensure a small total workload when agents follow the signal. The first family aims to achieve the minmax optimal approximation ratio compared to the optimal collaboration, which is shown to be Θ(√n) for unit-weight graphs, Θ(n^{2/3}) for graphs with constant minimum edge weights, and O(n^{3/4}) for general weighted graphs. The second family ensures per-instance strict improvement compared to full information disclosure.
            Free, publicly-accessible full text available January 15, 2026
- 
            Collaboration is crucial for reaching collective goals. However, its potential for effectiveness is often undermined by the strategic behavior of individual agents, a fact that is captured by a high Price of Stability (PoS) in recent literature [BHPS21]. Implicit in the traditional PoS analysis is the assumption that agents have full knowledge of how their tasks relate to one another. We offer a new perspective on bringing about efficient collaboration across strategic agents using information design. Inspired by the increasingly important role collaboration plays in machine learning (such as platforms for collaborative federated learning and data cooperatives), we propose a framework in which the platform possesses more information about how the agents' tasks relate to each other than the agents themselves. Our results characterize how and to what degree such platforms can leverage their information advantage and steer strategic agents towards efficient collaboration. Concretely, we consider collaboration networks in which each node represents a task type held by one agent, and each task benefits from contributions made to the task itself and its neighboring tasks. This network structure is known to the agents and the platform. On the other hand, the real location of each agent in the network is known to the platform only; from the perspective of the agents, their location is determined by a uniformly random permutation. We employ the framework of private Bayesian persuasion and design two families of persuasive signaling schemes that the platform can use to guarantee a small total workload when agents follow the signal. The first family aims to achieve the minmax optimal approximation ratio compared to the total workload in the optimal collaboration, which is shown to be Θ(√n) for unit-weight graphs, Θ(n^{2/3}) for graphs with edge weights lower bounded by Ω(1), and O(n^{3/4}) for general weighted graphs. The second family ensures per-instance strict improvement in the total workload compared to scenarios with full information disclosure.
            Free, publicly-accessible full text available January 12, 2026
- 
            We study calibration measures in a sequential prediction setup. In addition to rewarding accurate predictions (completeness) and penalizing incorrect ones (soundness), an important desideratum of calibration measures is truthfulness, a minimal condition for the forecaster not to be incentivized to exploit the system. Formally, a calibration measure is truthful if the forecaster (approximately) minimizes the expected penalty by predicting the conditional expectation of the next outcome, given the prior distribution of outcomes. We conduct a taxonomy of existing calibration measures. Perhaps surprisingly, all of them are far from being truthful. We introduce a new calibration measure termed the Subsampled Smooth Calibration Error (SSCE), which is complete and sound, and under which truthful prediction is optimal up to a constant multiplicative factor. In contrast, under existing calibration measures, there are simple distributions on which a polylogarithmic (or even zero) penalty is achievable, while truthful prediction leads to a polynomial penalty.
            Free, publicly-accessible full text available December 16, 2025
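As a rough illustration of what a calibration measure computes in a sequential setup, the sketch below implements the classic ℓ1 calibration error: for each distinct predicted value p, sum the residuals (outcome minus p) over the rounds where p was predicted, then add up the absolute values. This is one example of an existing measure, not the SSCE introduced in the abstract, and the function name `l1_calibration_error` is hypothetical.

```python
from collections import defaultdict


def l1_calibration_error(predictions, outcomes):
    """Classic l1 calibration error (unnormalized).

    For each distinct predicted value p, accumulate (y_t - p) over the
    rounds t where p was predicted, then sum the absolute totals.
    """
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have equal length")
    bias = defaultdict(float)
    for p, y in zip(predictions, outcomes):
        bias[p] += y - p
    return sum(abs(b) for b in bias.values())


if __name__ == "__main__":
    outcomes = [1, 0, 1, 0]
    # Predicting 0.5 every round matches the empirical frequency exactly.
    print(l1_calibration_error([0.5] * 4, outcomes))
    # Always predicting 1.0 overshoots on the two rounds with outcome 0.
    print(l1_calibration_error([1.0] * 4, outcomes))
```

The first call yields zero penalty and the second yields a penalty of 2, since the two rounds with outcome 0 each contribute a residual of -1 to the same predicted value.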
- 
            When learning in strategic environments, a key question is whether agents can overcome uncertainty about their preferences to achieve outcomes they could have achieved absent any uncertainty. Can they do this solely through interactions with each other? We focus this question on the ability of agents to attain the value of their Stackelberg optimal strategy and study the impact of information asymmetry. We study repeated interactions in fully strategic environments where players' actions are decided based on learning algorithms that take into account their observed histories and knowledge of the game. We study the pure Nash equilibria (PNE) of a meta-game where players choose these algorithms as their actions. We demonstrate that if one player has perfect knowledge about the game, then any initial informational gap persists. That is, while there is always a PNE in which the informed agent achieves her Stackelberg value, there is a game where no PNE of the meta-game allows the partially informed player to achieve her Stackelberg value. On the other hand, if both players start with some uncertainty about the game, the quality of information alone does not determine which agent can achieve her Stackelberg value. In this case, the concept of information asymmetry becomes nuanced and depends on the game's structure. Overall, our findings suggest that repeated strategic interactions alone cannot facilitate learning effectively enough to earn an uninformed player her Stackelberg value.
            Free, publicly-accessible full text available December 16, 2025
- 
            We prove novel algorithmic guarantees for several online problems in the smoothed analysis model. In this model, at each time step an adversary chooses an input distribution with density function bounded above pointwise by \(\tfrac{1}{\sigma}\) times that of the uniform distribution; nature then samples an input from this distribution. Here, σ is a parameter that interpolates between the extremes of worst-case and average-case analysis. Crucially, our results hold for adaptive adversaries that can base their choice of input distribution on the decisions of the algorithm and the realizations of the inputs in the previous time steps. An adaptive adversary can nontrivially correlate inputs at different time steps with each other and with the algorithm's current state; this appears to rule out the standard proof approaches in smoothed analysis. This paper presents a general technique for proving smoothed algorithmic guarantees against adaptive adversaries, in effect reducing the setting of an adaptive adversary to the much simpler case of an oblivious adversary (i.e., an adversary that commits in advance to the entire sequence of input distributions). We apply this technique to prove strong smoothed guarantees for three different problems:
            (1) Online learning: We consider the online prediction problem, where instances are generated from an adaptive sequence of σ-smooth distributions and the hypothesis class has VC dimension d. We bound the regret by \(\tilde{O}(\sqrt{T d\ln(1/\sigma)} + d\ln(T/\sigma))\) and provide a near-matching lower bound. Our result shows that under smoothed analysis, learnability against adaptive adversaries is characterized by the finiteness of the VC dimension. This is as opposed to the worst-case analysis, where online learnability is characterized by Littlestone dimension (which is infinite even in the extremely restricted case of one-dimensional threshold functions). Our results fully answer an open question of Rakhlin et al. [64].
            (2) Online discrepancy minimization: We consider the setting of the online Komlós problem, where the input is generated from an adaptive sequence of σ-smooth and isotropic distributions on the ℓ2 unit ball. We bound the ℓ∞ norm of the discrepancy vector by \(\tilde{O}(\ln^2(\frac{nT}{\sigma}))\). This is as opposed to the worst-case analysis, where the tight discrepancy bound is \(\Theta(\sqrt{T/n})\). We show such \(\mathrm{polylog}(nT/\sigma)\) discrepancy guarantees are not achievable for non-isotropic σ-smooth distributions.
            (3) Dispersion in online optimization: We consider online optimization with piecewise Lipschitz functions where functions with ℓ discontinuities are chosen by a smoothed adaptive adversary and show that the resulting sequence is \((\sigma/\sqrt{T\ell}, \tilde{O}(\sqrt{T\ell}))\)-dispersed. That is, every ball of radius \(\sigma/\sqrt{T\ell}\) is split by \(\tilde{O}(\sqrt{T\ell})\) of the partitions made by these functions. This result matches the dispersion parameters of Balcan et al. [13] for oblivious smooth adversaries, up to logarithmic factors. On the other hand, worst-case sequences are trivially (0,T)-dispersed.
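The σ-smoothness condition in the abstract above (density at most 1/σ times the uniform density) can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the domain is [0, 1) and that the adversary's distribution is piecewise constant on equal-width bins; the function names and the binned representation are hypothetical and not taken from the paper.

```python
import random


def max_density(bin_probs):
    """Largest density of a piecewise-constant distribution on [0, 1).

    Bin i covers [i/n, (i+1)/n) and carries probability bin_probs[i],
    so its density is bin_probs[i] * n (the uniform density is 1).
    """
    n = len(bin_probs)
    return max(p * n for p in bin_probs)


def is_sigma_smooth(bin_probs, sigma, tol=1e-12):
    """True if the density is pointwise at most 1/sigma times uniform."""
    if abs(sum(bin_probs) - 1.0) > 1e-9:
        raise ValueError("bin probabilities must sum to 1")
    return max_density(bin_probs) <= 1.0 / sigma + tol


def sample(bin_probs, rng):
    """Nature's step: draw one input from the binned distribution."""
    n = len(bin_probs)
    i = rng.choices(range(n), weights=bin_probs)[0]
    return (i + rng.random()) / n


if __name__ == "__main__":
    rng = random.Random(0)
    half_interval = [0.5, 0.5, 0.0, 0.0]  # uniform on [0, 0.5)
    point_like = [1.0, 0.0, 0.0, 0.0]     # uniform on [0, 0.25)
    print(is_sigma_smooth(half_interval, sigma=0.5))
    print(is_sigma_smooth(point_like, sigma=0.5))
    print(sample(half_interval, rng))
```

In this representation σ = 1 forces the uniform distribution (the average-case extreme), while σ close to 0 lets the adversary concentrate nearly all mass in a tiny region (approaching the worst case).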
- 
            Black-box finetuning is an emerging interface for adapting state-of-the-art language models to user needs. However, such access may also let malicious actors undermine model safety. To demonstrate the challenge of defending finetuning interfaces, we introduce covert malicious finetuning, a method to compromise model safety via finetuning while evading detection. Our method constructs a malicious dataset where every individual datapoint appears innocuous, but finetuning on the dataset teaches the model to respond to encoded harmful requests with encoded harmful responses. Applied to GPT-4, our method produces a finetuned model that acts on harmful instructions 99% of the time and avoids detection by defense mechanisms such as dataset inspection, safety evaluations, and input/output classifiers. Our findings question whether black-box finetuning access can be secured against sophisticated adversaries.